Antipattern: Use Dual-Purpose Foreign Key
Let’s investigate how using a foreign key to make polymorphic association affects the database.
We'll cover the following
Defining a Polymorphic Association#
To make Polymorphic Associations work, we must add an extra string
column alongside the foreign key on issue_id
. The extra column contains the name of the parent table referenced by the current row. In this example, the new column is called issue_type
and contains either Bugs
or FeatureRequests
corresponding to the names of the two possible parent tables in this association.
We can see one difference immediately: the foreign key declaration for issue_id
is missing. In fact, since a foreign key must specify exactly one table, using a Polymorphic Association means that we can’t declare this association in metadata. As a result, there is no enforcement of data integrity to ensure that the value in Comments.issue_id
matches a value in the parent table.
Likewise, no metadata ensures that the string in Comments.issue_type
corresponds to a table that exists in this database.
Querying a Polymorphic Association#
The issue_id
value in the Comments
table may occur in the primary key column of both parent tables, Bugs
and FeatureRequests
, or the value may occur in one parent table but be missing in the other parent table. It is, therefore, crucial to use the issue_type
correctly when joining the child table to the parent table. We must not match an issue_id
value to the FeatureRequests
table if it was intended to be matched to the Bugs
table.
For example, this will retrieve comments for a given bug by its primary key value 1234:
Although the previous query works if bugs are stored in the single table Bugs
, we run into a problem when Comments
is associated with both tables — Bugs and FeatureRequests. In SQL, we must specify all tables in a JOIN
; we cannot join Comments
to two separate tables and switch between them row by row according to the value in the Comments.issue_type
column.
To retrieve either a bug or a feature, given a specific comment, we need to run a query with an outer join to both parent tables. Only one of the parent tables will satisfy its join since part of the join condition relies on the value in the Comment.issue_type
column. An OUTER JOIN
means those fields from the table that do not match NULL
in the result set.
Non-object-oriented example#
In the example of Bugs
and FeatureRequests
, the two parent tables are meant to model related subtypes. Polymorphic Associations may also be used when the parent tables are completely unrelated to each other. For example, in an e-commerce database, both tables, Users
and Orders
, may be associated with Addresses
, as illustrated below.
Let’s create a table for the addresses as follows:
Mixing data with metadata
In this case, the Addresses
table contains a polymorphic column that names either Users
or Orders
as the parent table for a given address. Notice that we have to choose one or the other. We can’t associate a given address with both a user and an order, even an order placed by that user to ship merchandise to themselves.
Also, if a user has a shipping address as well as a billing address, we need some way to make this distinction in the Addresses
table. Likewise, all other parents need to note the special usage of addresses in the Addresses
table. These notes can propagate like weeds.